Simplify evaluation and prediction to mirror training datasets and create tests to assert expected eval behavior #1253
Conversation
This PR is being reviewed by Cursor Bugbot
```python
if len(self.sublist_lengths) > 0:
    batch_sublist_lengths = self.sublist_lengths[batch_idx]
    for idx, sub_idx in batch_sublist_lengths:
        result = self.format_batch(batch[sub_idx], idx, sub_idx)
```
Wrong predictions accessed due to reset sub_idx in flattened batch
High Severity
In MultiImage.postprocess, batch[sub_idx] indexes into the flattened prediction results using sub_idx, which resets to 0 for each image in the batch (as stored in sublist_lengths). As a result, predictions for the second and subsequent images incorrectly retrieve prediction results belonging to the first image. For example, with 2 images having 4 patches each, processing the second image (idx=1) uses sub_idx values 0-3 and reads batch indices 0..3 instead of 4..7. A running index into the flattened batch is needed instead.
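A minimal sketch of the running-offset fix the report suggests. The helper name `regroup` and the flat-list shapes are hypothetical illustrations, not DeepForest's actual code; the point is that image i should read `batch[offset : offset + n_patches_i]` rather than re-indexing from 0.

```python
def regroup(batch, patches_per_image):
    """Group a flat list of per-patch predictions back to their source images.

    batch: flattened predictions for every patch of every image in the batch.
    patches_per_image: number of patches each image contributed, in order.
    """
    results = []
    offset = 0  # running index into the flattened batch
    for n_patches in patches_per_image:
        # Slice this image's own patches instead of re-reading batch[0:n_patches]
        results.append(batch[offset:offset + n_patches])
        offset += n_patches
    return results
```

With 2 images of 4 patches each, the second image now receives elements 4..7 of the flattened batch rather than 0..3.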
Additional Locations (1)
Only a few comments for readability + needs rebase, but looks good. I think using Lightning-y methods is the right direction to avoid confusion. Once this is in, I'll rebase #1256, which should make this even tidier.
Force-pushed 71728ea to ffcd713
@jveitchmichaelis, @ethanwhite and I just discussed this. In this case, when you are ready, git squash and merge and we will just eat these history changes. I have learned my lesson about how to do this correctly in the future.
ethanwhite left a comment:
Flagging to stop merge while I finish understanding the history complexities here
- Make trainer.validate the preferred evaluation method and standardize train, eval, and predict to accept lists, not batches.
- Add a sublist concept for MultiImage datasets.
- Ensure predict_file follows the root_dir criteria for read_file.
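The "accept lists, not batches" standardization above can be sketched with a small input-normalization helper. The name `as_path_list` is a hypothetical illustration, not DeepForest's actual code; it only shows the idea that train, eval, and predict share one accepted input shape.

```python
def as_path_list(inputs):
    """Normalize user input so train/eval/predict accept the same shape:
    a single path becomes a one-element list; a list or tuple of paths
    passes through as a list; anything else is rejected early."""
    if isinstance(inputs, str):
        return [inputs]
    if isinstance(inputs, (list, tuple)):
        return list(inputs)
    raise TypeError(
        f"Expected a path or list of paths, got {type(inputs).__name__}"
    )
```

Pushing this check to the entry points means the dataset code never has to guess whether it was handed a pre-built batch.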
Force-pushed ffcd713 to 467edb9
```python
paths (List[str]): A list of image paths.
patch_size (int): Size of the patches to extract.
patch_overlap (float): Overlap between patches.
size (int): Target size to resize images to. Optional; if not provided, no resizing is performed.
```
Remove unused arg in docstring L30 + reference at L21
@bw4sz re-reading this in light of our discussion on Thursday to make sure I'm clear: is the idea that image size for prediction is solely set via patch_size + overlap? @ethanwhite if you're happy, could you approve changes to lift the merge block, please? Then I'll merge.
jveitchmichaelis left a comment:
Looks good to me. We should probably aim to improve our documentation on how sizes flow through the model as well, but not critical right now.
Yes, that's right, besides the internal RetinaNet resizing; that still applies for that model.
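As background on the exchange above, a minimal sketch of how `patch_size` and `patch_overlap` can jointly determine the prediction tiling of an image. The stride formula and helper name are illustrative assumptions, not DeepForest's exact implementation.

```python
def patch_offsets(image_dim, patch_size, patch_overlap):
    """Return start offsets of patches along one image dimension.

    patch_overlap is a fraction in [0, 1); each step advances by
    stride = patch_size * (1 - patch_overlap), so together the two
    parameters fully determine the tiling (no separate size argument).
    """
    stride = max(1, int(patch_size * (1 - patch_overlap)))
    offsets = list(range(0, max(image_dim - patch_size, 0) + 1, stride))
    # Ensure the final patch reaches the image edge
    if offsets and offsets[-1] + patch_size < image_dim:
        offsets.append(image_dim - patch_size)
    return offsets
```

For a 1000-px dimension with 400-px patches and 50% overlap, this yields starts at 0, 200, 400, and 600, with the last patch ending exactly at the edge.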
AI-Assisted Development
AI tools used (if applicable):
I used Cursor planning mode to help structure the tests, which I then edited and simplified.
Note
Major refactor to streamline augmentation, evaluation, prediction, and I/O with extensive tests.
- Move augmentations to `kornia` (remove `albumentations`), add `ZoomBlur` and `RandomPadTo`, and switch dataset transforms to `AugmentationSequential`
- Standardize evaluation on `trainer.validate` with simplified mAP logging; deprecate `main.evaluate` (internal `__evaluate__` retained)
- Rework `MultiImage` and tiled-raster handling, and update `predict`/`predict_tile` `postprocessing` utilities
- Improve `read_file` and geospatial handling with `DeepForest_DataFrame`, explicit `image_path`/`label`/`root_dir` assignment, and improved COCO/shape conversions
- Standardize on `image`/`root_dir` (remove explicit `width`/`height`), and harden callbacks for empty annotations
- Add `log_root`; crop model: add bbox `expand` and dataset support
- Expand tests (`training`, `prediction`, `cropmodel`) for new transforms, normalization, and box filtering; ensure 3-channel checks
- Add `kornia` dependency; tweak `codecov.yml` to mark patch status informational

Written by Cursor Bugbot for commit 71728ea.